ABSTRACT

Context. The asteroseismic timeseries obtained from the Kepler satellite and analysed here span 19 months. Kepler provides
the longest continuous timeseries currently available, which calls for a study of the influence of the increased timespan on the accuracy
and precision of the obtained results.

Aims. We aim to investigate how the increased timespan influences the detectability of the oscillation modes, and the absolute values
and uncertainties of the global oscillation parameters, i.e., frequency of maximum oscillation power, ν_max, and large frequency
separation between modes of the same degree and consecutive orders, ⟨Δν⟩.

Methods. We use published methods to derive ν_max and ⟨Δν⟩ for timeseries ranging from 50 to 600 days and compare these results as
a function of method, timespan and ⟨Δν⟩.

Results. We find that, in general, timeseries of the order of 400 days are necessary to obtain reliable results for the
global oscillation parameters in more than 95% of the stars, but this does depend on ⟨Δν⟩. In a statistical sense, the quoted uncertainties
seem to provide a reasonable indication of the precision of the obtained results in short (50-day) runs; they do, however, seem to be
overestimated for results of longer runs. Furthermore, the different definitions of the global parameters used in the different methods
have non-negligible effects on the obtained values. Additionally, we show that there is a correlation between ν_max and the flux variance.

Conclusions. We conclude that longer timeseries improve the likelihood of detecting oscillations with automated codes (from ~60% in
50-day runs to >95% in 400-day runs, with a slight method dependence) and the precision of the obtained global oscillation
parameters. The trends suggest that the improvement will continue for even longer timeseries than the 600 days considered here, with a
reduction in the median absolute deviation of more than a factor of 10 for an increase in timespan from 50 to 2000 days (the currently
foreseen length of the mission). This work shows that global parameters determined with high precision (thus from long datasets)
using different definitions can be used to identify the evolutionary state of the stars.

Key words. stars: red giants - stars: oscillations - stars: interior - techniques: photometric



Send offprint requests to: S. Hekker,
email: S.Hekker@uva.nl

* Values of the global oscillation parameters can be obtained from
the authors upon request.

1. Introduction

Many breakthrough results for red-giant (G-K) stars have been
presented using data obtained by the CoRoT (Baglin et al. 2006)
and NASA Kepler (Borucki et al. 2010) missions. These results
include statistical ensemble studies of global oscillation
parameters, i.e., frequency of maximum oscillation power, ν_max,
mean frequency separation between modes of the same degree and
consecutive orders, ⟨Δν⟩, small frequency separations between
modes of different degree, ℓ, amplitudes and visibilities of the
oscillations, and tests of scaling relations (e.g., De Ridder et al.
2009; Hekker et al. 2009; Bedding et al. 2010; Huber et al. 2010;
Hekker et al. 2011d; Huber et al. 2011; Mosser et al. 2012).
Additionally, it has been possible to determine stellar parameters
such as masses and radii (Kallinger et al. 2010b). In addition to
these results, asteroseismic investigations into the granulation
(Mathur et al. 2011), red giants in clusters (Basu et al. 2011;
Hekker et al. 2011b; Stello et al. 2011b,a) and red giants in
eclipsing binaries (Hekker et al. 2010b) have been performed, as
well as detailed investigations into the internal structure of
single stars (e.g., Di Mauro et al. 2011; Jiang et al. 2011;
Baudin et al. 2012). The Kepler results referred to are based on
timeseries with a near regular cadence of either 29.4 min or
58.85 s and a timespan ranging from ~30 days up to more than
1.5 yr. These are the first datasets from space-based telescopes
with such long timespan and high fill (>90%) and frequency
resolution (≈0.019 μHz). Underpinning much of this work is the
ability to determine global oscillation parameters and the
uncertainties in these values. It is reasonable to
ask if there are now enough data available and whether there
are any gains to be obtained from observing individual stars for
longer periods. In this paper we address the precision and
reliability of the determination of some of the global seismic
parameters. There are other areas where there is a clear need for
data of longer duration because the features detected in the power
spectra are narrow and hence barely resolved even by the current
datasets. In particular, we highlight the detection of g-p mixed
modes (Beck et al. 2011). The observed mean period spacings
appear to have different values for stars that burn only H (in
a shell) and those that also burn He in the core (Bedding et al.
2011; Mosser et al. 2011a). Hence, the period spacing can be used
to distinguish between different evolutionary states in which red
giants are observed using the characteristics of their frequency
spectra. Another method to distinguish between different
evolutionary phases is based on the difference in frequency
dependence of radial modes (Kallinger et al. 2012). Furthermore,
recently, the timeseries obtained with Kepler have become long
enough to study rotational splitting of the oscillation modes,
which led to the detection of differential rotation in red giants
(Beck et al. 2012).

In this work, we use the 19 months of data available from Q0
to Q7 to investigate how the increased timespan influences the
detectability of the oscillation modes, and the absolute values
and uncertainties of the global oscillation parameters, ν_max and
⟨Δν⟩. These are important in several ways. Knowing the dependence
of the precision on data duration is a guide for observing
strategies, and for the determination of those secondary
parameters that are derived from the primary global oscillation
parameters, such as stellar mass and radius. Furthermore, it is
crucial to be able to estimate the proportion of false negatives
and false positives for population studies. Also, for detailed
modelling of individual oscillation frequencies, ν_max turned out
to be of great diagnostic potential (Gruberbauer et al. 2012). We
will include in our considerations the impact of other relevant
parameters such as the observed height-to-background ratio of the
oscillation excess. This work is a follow-up of Hekker et al.
(2011c, hereafter paper I) on the red giants and Verner et al.
(2011) on solar-type stars, in which results obtained with
different methods have been compared and validated.

Paper I described the comparison of global oscillation
parameters extracted from about four months of Kepler data using
different methods. From this comparison, it was concluded that
1) the results from the different methods agree for most stars
within a few percent; 2) at least five methods (out of the seven
tested) obtained results for 92% of stars for ν_max within the
range of 50 μHz to 170 μHz, and this percentage decreased to 69%
when all stars with ν_max covering the complete frequency range,
i.e., up to 283.4 μHz (the Nyquist frequency), were included; 3) the
scatter due to realization noise, originating from the stochastic
nature of the oscillations, is non-negligible and can be at least
as important as the internal uncertainty of the results due to the
method used, but this depends on the frequency of maximum
oscillation power, ν_max, and on the methods. When a model is
used to describe the variation of Δν with frequency, the results
are less sensitive to realization noise than otherwise; 4) the
obtained value of ⟨Δν⟩ is less dependent on the frequency
range over which it is computed than is the case for solar-type
stars. A theoretical follow-up study to explain the latter has been
performed by Hekker et al. (2011a).

Fig. 1. Distribution of the mean large frequency separations of
the stars in our sample.



2. Data 

For the current study, we use data obtained with the Kepler
satellite during its first ~19 months of operation (Q0-Q7). These
data have a ~29.4 minute near regular cadence and have
been corrected for possible artifacts in the way described by
Garcia et al. (2011). See e.g. Jenkins et al. (2010) for some
characteristics of these data. The stars in the sample investigated
here have been selected for asteroseismic investigations by the
Kepler Asteroseismic Science Consortium (KASC) or for
astrometric purposes. We exclude cluster stars from this sample.
Additionally, we include only stars for which a power excess
characteristic of stochastic oscillations is detected. In some
cases the stars episodically fall on the one CCD that has gone
inactive, resulting in loss of data. We exclude these stars from our
current investigation. Other causes of data loss are safe mode
and momentum dumping from the spacecraft, as well as data
downlinks every ~30 days. These result in rather smaller losses
of data. We require that the stars have been observed in all
available quarters and we accept a fill level down to 94%, accounting
for some additional loss of data. This then leaves us with 1028
stars.

The ⟨Δν⟩ distribution of stars in the dataset we consider here
is shown in Fig. 1. This distribution is similar to the ones seen in
other published work on the Kepler red giants (e.g., Hekker et al.
2011d).



3. Parameter extraction 

For the data analysis, all the methods used here are based
on a subset of those described in paper I, i.e., COR
(Mosser & Appourchaux 2009; Mosser et al. 2011b), OCT
(Hekker et al. 2010a) and CAN (Kallinger et al. 2010a). For
ν_max values we have used the autocorrelation function from
COR:EACF (Mosser & Appourchaux 2009), and the centre of the
Gaussian fit to the oscillation power excess from OCT (method II
in Hekker et al. 2010a) and CAN (Kallinger et al. 2010a). For
⟨Δν⟩ we use the autocorrelation method (COR:EACF,
Mosser & Appourchaux 2009) and the universal pattern (COR:UP,
Mosser et al. 2011b), as well as the determination of the peak in
the power spectrum of the power spectrum using statistics of
grouped data (OCT:PS⊗PS) and with the addition of Bayesian
statistics (OCT:PS⊗PS (bayesian); Hekker et al. 2010a), and
finally, fitting of the central three radial orders (CAN,
Kallinger et al. 2010a).

A homogeneous comparison between the values of the
shorter timeseries as presented in paper I, and of longer
timeseries cannot be performed directly, as continuous improvements
to the methods have been made. These improvements have been
made as a result of our increasing knowledge of the data from
earlier runs and to deal with the longer timeseries. The changes
are of a numerical nature and do not alter the underlying
principles of the methods. Hence, the references cited above are still
valid. To perform a uniform study of the impact of the length of
the timeseries, the (Q0-Q7) dataset (~600 days) has been used
both as a whole and divided into subsets. These datasets are all
analysed with the latest versions of the analysis methods.
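
The division into subsets can be sketched as follows. This is only an illustration, assuming consecutive, non-overlapping segments of equal length and a nominal span of 600 days; the function name `split_timeseries` and the synthetic sampling are hypothetical and not taken from any of the analysis codes.

```python
def split_timeseries(times, values, segment_days, total_days=600.0):
    """Split a timeseries into consecutive, non-overlapping segments.

    times are in days; total_days is the nominal span (assumed ~600 days).
    Returns a list of (times, values) pairs, one per complete segment.
    """
    n_segments = int(total_days // segment_days)
    t0 = times[0]
    segments = [([], []) for _ in range(n_segments)]
    for t, v in zip(times, values):
        idx = int((t - t0) // segment_days)
        if idx < n_segments:  # points beyond the last complete segment are dropped
            segments[idx][0].append(t)
            segments[idx][1].append(v)
    return segments


# Synthetic, evenly sampled example (two points per day over 600 days).
times = [0.5 * i for i in range(1200)]
values = [0.0] * len(times)
run_counts = {d: len(split_timeseries(times, values, d))
              for d in (50, 100, 200, 400, 600)}
```

With these assumptions the number of runs per star is the integer part of 600 divided by the run length, which matches the run counts listed in the caption of Table 1.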

4. Likelihood of detecting oscillation power in 
frequency spectra 

Recently, Hekker et al. (2011d) analysed one-month data sets of
publicly available data for over 16 000 red giants selected on the
basis of effective temperature and surface gravity. They found
that in ~70% of the stars, oscillations could be detected. This
raises the question of whether this fraction is telling us
something about the ability of red giants to sustain stochastically-
driven oscillations, or whether it is just a reflection of the
difficulties in the automated detection of oscillations in relatively
short data sets. Perhaps also, some of the stars were so faint that
their noise levels prevented the oscillations being detected.
Alternatively, can some other feature in the star suppress the
oscillations in the same manner as activity is known to suppress the
oscillations in solar-like stars (Mosser et al. 2009; Chaplin et al.
2011a; Huber et al. 2011)? Here we will first give consideration to
the importance of the amplitude of the oscillations and apparent
brightness of the stars and we will subsequently consider the
problems associated with the automated methods.

We consider how we might estimate the likelihood of detecting
oscillation power when it is present in the data. We use the
same method as given in Chaplin et al. (2011b), adapted for the
red giants, to show that there is a high expectation that we will
be able to detect the modes of oscillations in all the red giants
in the Kepler data set. It is important to note that, although the
correct identification of the frequency range in which the modes
are located is of fundamental importance, most current methods
do not use this as their primary consideration when determining
if there is oscillation power in the spectrum. For most of the
methods, the determination of Δν is done first. If this fails then
'no detection' is reported. This may not be the best strategy, but
before we construct that discussion we should first explore the
existing predictions for the amplitudes of the modes of
oscillations in red giants.

4.1. Prediction for mode power

Kjeldsen & Bedding (1995) devised scaling relations predicting
that the amplitudes of solar-like oscillations scale with
the luminosity-to-mass ratio of the star, which implies that the
amplitude of the oscillations increases with increasing stellar
radius. Hence, solar-like oscillations in red giants are expected to
have higher amplitudes than oscillations in solar-type stars of
equal masses. These scaling relations have recently been revised
(Kjeldsen & Bedding 2011), and also tested, both theoretically
(e.g., Samadi et al. 2007) and observationally (e.g., Baudin et al.
2011a,b; Huber et al. 2011; Stello et al. 2011a).



To determine if it is possible to detect the modes, we are
interested in the signal-to-noise ratio in the vicinity of the modes.
In Mosser et al. (2012) it was shown that, with a small
dependence on evolutionary status, the ratio of the height of the
smoothed power spectrum to the granulation noise background
evaluated at ν_max is between 3.7 and 4.0 for clump and red-giant
branch stars, respectively. Accordingly, we will use the lower
limit of this to work out the signal-to-noise in the integrated
spectral power excess. For all the red giants that we consider
here, the intrinsic photon shot noise is negligible and we neglect
it. This removes a consideration of the stellar luminosity from
the calculations.

A commonly accepted model of the envelope of the
oscillation power is a Gaussian function whose width, w_env, scales
with the frequency of maximum power ν_max as w_env = 0.59 ν_max^0.90
(Mosser et al. 2010). We can determine the average power in the
oscillations by smoothing the power spectrum over a range of at
least one large spacing so that no trace of the individual modes
remains. It is recognized that doing this in practice requires
considerable care, as is spelt out in Mosser et al. (2012). We will take
twice the full-width at half-maximum of the underlying Gaussian
as the range over which we will integrate to determine the
average power. This range contains all but a few per cent of the
oscillation power. The granulation background in the vicinity of
the modes can be modelled with a power law with an index of -2.1
(Mosser et al. 2012). Integration of these two functions, over the
same range, leads to an integrated height-to-background ratio
(H/B_int) of 1.55 (0.42 times the height-to-background ratio at ν_max).
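
As a rough cross-check of the factor relating the integrated ratio to the ratio at ν_max, the two integrals can be evaluated analytically. The sketch below is illustrative only: it assumes the envelope relation w_env = 0.59 ν_max^0.90, interprets w_env as the full width at half maximum, and integrates both functions over ν_max ± w_env; the function names are hypothetical.

```python
import math


def envelope_fwhm(nu_max):
    """Assumed envelope width relation w_env = 0.59 * nu_max**0.90 (muHz)."""
    return 0.59 * nu_max ** 0.90


def integrated_ratio_factor(nu_max, index=-2.1):
    """Factor f such that H/B_int = f * (H/B at nu_max).

    A Gaussian envelope of unit height and a power-law background of unit
    height at nu_max are both integrated over nu_max +/- one FWHM.
    """
    w = envelope_fwhm(nu_max)
    sigma = w / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    lo, hi = nu_max - w, nu_max + w
    # Integral of exp(-(nu - nu_max)^2 / (2 sigma^2)) over [lo, hi].
    gauss = sigma * math.sqrt(2.0 * math.pi) * math.erf(w / (sigma * math.sqrt(2.0)))
    # Integral of (nu / nu_max)**index over [lo, hi].
    p = index + 1.0
    background = nu_max * ((hi / nu_max) ** p - (lo / nu_max) ** p) / p
    return gauss / background


factor = integrated_ratio_factor(30.0)   # nu_max = 30 muHz, chosen for illustration
hb_int = 3.7 * factor                    # using the lower H/B limit from the text
```

Under these assumptions the factor depends weakly on ν_max and should come out close to the quoted 0.42 for ν_max of a few tens of μHz.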

The averaging of the data during the sampling interval in the
time domain causes an attenuation of the amplitudes in the
frequency spectrum according to a sinc function and we can use the
ratio of ν_max to ν_Nyq, the Nyquist frequency which is 283.4 μHz
for the Kepler long cadence data, to quantify the size of this
reduction. The majority of stars in our sample have ν_max ≪ ν_Nyq
and for these stars this sinc term is negligible and we do not
consider it further.
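
The size of this attenuation can be illustrated with a short sketch. It assumes that the integration time equals the sampling interval, so that the amplitude attenuation is sinc(π/2 · ν/ν_Nyq); the function name is hypothetical.

```python
import math

NU_NYQ = 283.4  # muHz, long-cadence Nyquist frequency quoted in the text


def sinc_attenuation(nu, nu_nyq=NU_NYQ):
    """Amplitude attenuation from averaging over one sampling interval."""
    x = 0.5 * math.pi * nu / nu_nyq
    return 1.0 if x == 0.0 else math.sin(x) / x


# Attenuation in power is the square of the amplitude attenuation.
power_attenuation = {nu: sinc_attenuation(nu) ** 2 for nu in (5.0, 30.0, 100.0, 283.4)}
```

At the Nyquist frequency the amplitude factor is 2/π, while for ν much smaller than ν_Nyq it approaches unity.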

4.2. Model of detection probability 

Here we present a model, based on the predicted integrated
height-to-background in the vicinity of the oscillations, for how
detectable the oscillations are. To do this we adapt the
formulation devised by Chaplin et al. (2011b) for solar-type stars to red
giants. The principle of the method is to compare the power
present in the modes with that present in the background and
then to use probability distributions to ascertain the likelihood
of the mode power being reliably detected.

The question that we now wish to answer is 'given the H/B_int,
what is the chance of a false detection?'. We set a probability,
p_false, at which we are prepared to risk a false positive
detection. In general, this level should be low. Typically for this work
we have used p_false = 0.01 (i.e., 1%). As detailed in Chaplin et al.
(2011b), we compute a threshold value, θ, in a χ² distribution
with 2n degrees of freedom (dof) such that the probability that
some random variable is greater than the threshold value
supplied is equal to p_false. In this, we take n as the number of
independent frequency bins used to compute H/B_int. We must also
take account of the chance that, because of random noise in the
data, we will miss a true detection for a star with sufficient
signal-to-noise for detection, which leads to a new threshold
value θ_2:

\theta_2 = \frac{\theta + 1}{H/B_{\mathrm{int}} + 1} \qquad (1)
Fig. 2. Fraction of runs with returned values for each star per Δν interval. Each panel shows the results of a certain method (A: COR
- Universal Pattern, B: COR - EACF, C: OCT - PS⊗PS, D: OCT - PS⊗PS (bayesian), E: CAN) with run length 50, 200, 400, 600
days in red, blue, cyan and black, respectively. Note that the 50 and 600 day curves in panel E overlap due to the fact that the 600
day results were used to constrain the input for the 50 day runs. No results for 200 and 400 day long runs were obtained by CAN.



This value θ_2 is then used to derive probability p, where p is the
probability that in a χ² distribution with 2n dof a random variable
is less than or equal to the cut off level specified. Finally we have
the probability we sought, which is p_final = 1 - p, the probability
that a given H/B_int will exceed the computed threshold θ.
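
The recipe can be sketched in a few lines. This is a minimal illustration and not the code used for the results quoted here. It assumes that the test statistic is the mean power-to-background ratio over n independent bins, so that for pure noise 2n(1 + H/B) follows a χ² distribution with 2n dof; the function names and the example values of n are hypothetical.

```python
import math


def chi2_sf_even(x, n):
    """P(X > x) for a chi-square variable with 2n dof (n a positive integer)."""
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    log_h = math.log(h)
    # Closed form for even dof: exp(-h) * sum_{k<n} h^k / k!, summed in log space.
    terms = (math.exp(k * log_h - math.lgamma(k + 1) - h) for k in range(n))
    return min(1.0, math.fsum(terms))


def false_alarm_threshold(n, p_false=0.01):
    """Threshold theta that pure noise exceeds with probability p_false."""
    def sf(theta):
        return chi2_sf_even(2.0 * n * (1.0 + theta), n)

    lo, hi = -1.0, 1.0
    while sf(hi) > p_false:      # expand the bracket until it contains theta
        hi *= 2.0
    for _ in range(100):         # bisection on the monotonic survival function
        mid = 0.5 * (lo + hi)
        if sf(mid) > p_false:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def detection_probability(hb_int, n, p_false=0.01):
    """p_final = 1 - p for a star with integrated ratio hb_int."""
    theta = false_alarm_threshold(n, p_false)
    theta_2 = (theta + 1.0) / (hb_int + 1.0)   # Eq. (1)
    return chi2_sf_even(2.0 * n * theta_2, n)


# Illustrative bin counts only; the real n follows from the run length
# and the width of the integration range.
p_few_bins = detection_probability(1.55, 13)
p_many_bins = detection_probability(1.55, 321)
```

For a fixed H/B_int the detection probability rises with the number of independent bins, which is why stars with low ν_max (narrow envelopes, hence few bins in a short run) have the lowest predicted probabilities.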

The recipe as described predicts that for all stars considered
here (even for datasets as short as 50 days) we are likely to detect
the oscillations. In general, the lower probabilities are at about
93% likelihood for stars which have ν_max below 10 μHz. For one
particular star the detection probability dropped to 75%. Note
that these predictions are not sensitive to the shape of the
oscillation power excess nor to any structure in it, such as the
large frequency separation, and that we have taken the worst case
scenario for the H/B, that of the helium-core-burning evolutionary
status. These results are based on the integrated power of the
oscillations. So from this test it appears that a detection rate higher
than the 70% quoted by Hekker et al. (2011d) would be expected
when using the H/B indications. But how does this compare with
observational results from longer timeseries? We consider
this issue in the next section.

5. Observational results 

For observed stars we do not know the true values of the
seismic parameters. All that we can do is to estimate them using
the observations. In order to obtain such estimates of the
seismic parameters ν_max and Δν, the COR and OCT methods are
used to analyse the full timespan of just under 600 days of the
complete set of stellar data. The analysis was 'blind' in that no
manual checks were made on the outcomes. We therefore
expect to have some errors in the results. We did not use the CAN
method because, for computational reasons (the MultiNest
procedure is very time consuming), not all available stars were
analysed with it. For 974 stars there is close agreement between the
results from OCT and COR for ν_max and ⟨Δν⟩. In this context,
close agreement is taken to be that the two completely
independent methods identify the oscillations in the same region of the
spectrum to within half the expected width of the envelope of
the oscillation power, with the width of the oscillation envelope
as defined by Mosser et al. (2010). Taking this relatively relaxed
constraint is justified by the fact that we want to select a
statistically significant sample of stars with oscillations detected by
different methods in the same frequency range. For the
remaining 54 stars, there are disagreements between the values obtained
with the different methods. We inspected these stars by eye:
for 39 stars the oscillations are at low frequencies (ν < 5 μHz),
for four stars the oscillations straddle the Nyquist frequency and
for 11 stars we do not have the standard red-giant oscillation
spectrum due to the presence of artefacts, or these could possibly
be mis-classified as red giants.

For the 974 stars for which there is agreement, we create
reference values which are the mean values of ν_max and ⟨Δν⟩,
respectively. These reference values are essentially an arbitrary
zeropoint used to select reliable results and to discard outliers.
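
The agreement criterion and the reference value for ν_max can be sketched as follows. This is a hypothetical illustration: it assumes the envelope width relation w_env = 0.59 ν_max^0.90 and evaluates the width at the mean of the two results, a choice that the text does not specify.

```python
def envelope_width(nu_max):
    """Assumed envelope width relation (muHz)."""
    return 0.59 * nu_max ** 0.90


def reference_nu_max(nu_cor, nu_oct):
    """Mean of the COR and OCT results if they agree, otherwise None.

    Agreement means the two values differ by no more than half the
    envelope width (evaluated here at their mean, by assumption).
    """
    mean = 0.5 * (nu_cor + nu_oct)
    if abs(nu_cor - nu_oct) <= 0.5 * envelope_width(mean):
        return mean
    return None


agreeing = reference_nu_max(100.0, 104.0)    # close results give a reference value
disagreeing = reference_nu_max(30.0, 80.0)   # discrepant results give None
```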



5.1. Outlier removal in short datasets

When short datasets are considered there will be occasions when
the returned values are unreliable. We wish to remove some of
these so that we can look at the spread in the reliable results. A
very simple outlier rejection algorithm is used whose purpose is
to reject patently wrong answers. This is the same as described in
paper I and depends on comparing the reference value with the
individual values. The results presented in Verner et al. (2011)
suggest that for solar-type stars it is appropriate to use rejection
criteria that scale with the ν_max of the star. However, it was shown
in paper I that this is not appropriate for red-giant stars. The
process adopted here first rejects points that are more than 50%
different from the reference value, irrespective of ν_max or ⟨Δν⟩,
and then applies an absolute cut. For these absolute cuts a value
of 10 μHz has been used for all but the low values of ν_max and
a cut of 2 μHz has been used for ⟨Δν⟩. The cross-over position
where the absolute cut off is more stringent than the relative one
occurs at about ν_max = 20 μHz.
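
The two-step rejection can be written compactly. The sketch below is a minimal illustration of the cuts as described in the text; the function name and the example values are hypothetical.

```python
def is_reliable(value, reference, abs_cut, rel_cut=0.5):
    """Apply the relative cut first, then the absolute cut (both in muHz)."""
    deviation = abs(value - reference)
    if deviation > rel_cut * reference:   # more than 50% from the reference
        return False
    return deviation <= abs_cut           # absolute cut


NU_MAX_CUT = 10.0      # muHz, absolute cut for nu_max
DELTA_NU_CUT = 2.0     # muHz, absolute cut for <Delta nu>

checks = [
    is_reliable(108.0, 100.0, NU_MAX_CUT),   # within 10 muHz: kept
    is_reliable(112.0, 100.0, NU_MAX_CUT),   # beyond 10 muHz: rejected
    is_reliable(16.0, 10.0, NU_MAX_CUT),     # more than 50% off: rejected
    is_reliable(5.5, 4.0, DELTA_NU_CUT),     # within both cuts: kept
]
```

The relative cut equals the absolute one where 0.5 ν_max = 10 μHz, i.e. at ν_max = 20 μHz, in line with the cross-over quoted in the text.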

5.2. Is a data duration of 50 days enough to reliably detect 
the presence of modes? 

The statistical tests considered in Sect. 4.2 suggested that
50 days of data were sufficient to reliably detect the presence
of the oscillation power based on the height-to-background
ratio. We can now see if that is true with the algorithms used. As
we used only stars for which we had firm detections of
oscillations in the 600-day dataset, we expected to have results for
each of the 50-day runs, i.e. 12 results per star. For runs of
duration 200 days we expect to have 3 returns, etc. This is not
the case, as can be seen from Fig. 2, where for each of the
different methods we plot the fraction of returns for the different
data durations as a function of Δν on a logarithmic scale. The
data have been binned for this graph. In general the bin width
used is just under 1 μHz, but bins are combined at high
frequencies to improve the statistics where there are few stars in the
original sample, as can be seen in Fig. 1. As expected, as the
run duration increases the general efficacy of each method
improves. The exception to this is the CAN method, where the
data from the long runs are used to constrain the fitted
parameter ranges in the short runs and the method is not 'blind'. The
results are summarised in Table 1. Although all methods have
difficulties at low frequency, the different methods are clearly
somewhat different in the spectral regions where their response
is reliable. Additionally, COR is less effective for mid-range
frequencies. The detection capabilities of the EACF method
underlie the two methods COR:UP and COR:EACF employed for the
determination of ⟨Δν⟩. For this method the value of the
parameter A_max as given in Mosser & Appourchaux (2009) is important.
The threshold value set for a detection is 8 for rejecting the H0
hypothesis at the 1% level. They have shown that the value of
A_max increases linearly with the duration of the dataset and so
we expect a marked improvement as longer datasets are used.
This is indeed the case, as shown in Table 1. The peak detection
method underlying CAN depends on a predefined list of
stars, and hence this shows the distribution of the type of stars
on the list used for the analysis presented here. The OCT method
has issues at the very high frequencies.



Table 1. Fraction of runs per star, for which results have been
returned for ⟨Δν⟩ as a function of timespan of the data, where
12, 6, 3, 1 and 1 runs are available for data of 50, 100, 200, 400
and 600 days length, respectively.



method                  50      100     200     400     600
                        days    days    days    days    days

COR:UP                  0.62    0.86    0.97    0.99    1
COR:EACF                0.62    0.86    0.97    0.99    1
OCT:PS⊗PS               0.85    ...     0.94    0.96    1
OCT:PS⊗PS (bayesian)    0.55    ...     0.89    0.96    0.99



Fig. 5. Median absolute deviations (MAD) observed for ⟨Δν⟩
(top) and ν_max (bottom) for the different methods as a function of
the timespan of the dataset. Colour coding the same as in Fig. 3,
with red for OCT, green for COR:EACF, black for COR:UP and
blue for CAN. The 400 day results of COR:UP agree with the
600 day results and hence the MAD is 0.000 and not shown.
The dashed lines indicate linear fits through the data (with same
colour-coding) in log-scale. See text for further details.



It is important to note that all these algorithms rely not on
detecting the presence of the oscillation power but instead they
look for patterns in the spectrum that are the consequence of
the regular spacing in the spectrum of the modes. In looking
at the fractions of the stars for which we detect regular mode
structure we are really considering a different measure from the
H/B ratio-based derivations in Sect. 4.2; hence we are
comparing two different strategies. In a dataset of 50 days the modes
are barely resolved (Baudin et al. 2011a,b) and so the amplitude
of the mode in the spectrum is very variable. In fact the power
varies as χ² with 2 dof, which means that the probability
distribution of power is negative exponential and it is not unusual for
a particular mode to be essentially absent. As the duration of the
dataset increases and the modes become resolved this is less of
a problem. From Table 1 we see that for timeseries of 100 days
length we have just about 85% return and for 200 days long
timeseries about 95%, increasing to over 95% for 400 day datasets.
The OCT:PS⊗PS (bayesian) is most sensitive to the timespan of
the data and is only as reliable in detecting the oscillations as the
other methods for timeseries of 400 days or longer. These tests
suggest that in short datasets the height-to-background would be
a more reliable method to detect oscillations as opposed to the
currently developed methods based on the regularity of the
frequency pattern.
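
A short worked example illustrates why an unresolved mode can appear essentially absent. For power P that follows a χ² distribution with 2 dof and has mean ⟨P⟩ (the symbols P, ⟨P⟩ and f are introduced here for illustration only), the distribution and the chance of observing less than a fraction f of the mean power are:

```latex
p(P)\,\mathrm{d}P = \frac{1}{\langle P \rangle}
  \exp\!\left(-\frac{P}{\langle P \rangle}\right)\mathrm{d}P,
\qquad
\mathrm{Prob}\left(P \le f\,\langle P \rangle\right) = 1 - e^{-f}.
```

For f = 0.1 this gives 1 - e^{-0.1} ≈ 0.095, so roughly one in ten realizations of an unresolved mode shows less than a tenth of its mean power.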
Fig. 3. Normalised distribution of the offset of the individual results from the reference value, i.e., the result of the 600 day run of the
same method for the same star (left), and normalised distributions of the uncertainties (centre) for ⟨Δν⟩ for 50 (top), 200 (middle)
and 400 day (bottom) datasets. COR:UP, COR:EACF, CAN and OCT results are indicated in black solid lines, green dashed-dotted
lines, blue dashed-triple dotted lines and red dashed lines respectively. The right column shows the normalised distribution of the
offset of the individual results divided by its stated uncertainty for data of 50 (top), 200 (middle) and 400 days (bottom) length.
Colours and linestyles are the same as in the left panels.



So the simple answer to the question posed at the
beginning of this section is 'no, 50 days is not enough to be
certain to pick up more than 90% of the oscillations with the
currently employed methods, but with methods based on
height-to-background it is predicted that it would be possible to obtain
reliable results in such short datasets.'

5.3. Dependence of ν_max, ⟨Δν⟩ and their quoted uncertainties
on the length of the timeseries

We have looked at the likelihood of the modes being detected in
datasets of differing lengths but there is another important
consideration. Here we consider the precision of these results by
comparing them with reference values. Because all methods use
slightly different definitions for ν_max and ⟨Δν⟩ and we first aim
to investigate the influence of the timespan only, we use the
results of the 600 day run of a particular method as the reference
with which to compare results of the shorter runs of that same method.
We evaluate both the deviation of the returned values from the
reference values and the quoted uncertainty on the value.

We first explore how the deviations from the reference value
and the uncertainties compare for the different data durations.
The left panels of Figs. 3 and 4 show the distribution of the
deviation of the individual results from their respective reference
values for each of the global parameters considered here, for data
with a timespan of 50, 200 and 400 days and for the range of
methods employed. The different timespans are shown in different
rows and the different methods are plotted in different colours
with different line styles. These panels show that, except for the
measure of ⟨Δν⟩ by COR:UP, the spread in the difference decreases
with increasing timespan of the data. The different behaviour of
COR:UP originates from the fact that this method applies the
additional constraint of a regular pattern on the spectrum. The
decrease of the spread with increasing timespan raises the question
whether we can expect further improvements from even longer
datasets. Therefore, we show the spread as a function of timespan in
Fig. 5. The spread for COR:UP is 0.000 at 400 days (not shown) and
this method is very reliable at determining ⟨Δν⟩ even for short
datasets. The decreasing trend of the spread in the global
oscillation parameters for longer timeseries of the other methods
suggests that longer datasets would still improve the precision
of the obtained parameters. To investigate this further we show
linear fits in log-scale through the MAD values of each method.
Extrapolating these fits to 2000 days (~5.5 yrs, which is the
currently predicted length of the mission) implies a reduction in
the MAD of at least a factor of 10 (for ⟨Δν⟩ factors of 23, 11 and 20
for COR:UP, COR:EACF and OCT respectively, and for ν_max factors of
10 and 14 for COR:EACF and OCT). In addition to the spread in the
results we also checked for potential biases. It is noticeable that
the offsets are not zero even though the method is its own
reference. These biases are more clearly visible in the right hand
panels, where we show the distribution of the offsets divided by
the quoted uncertainty (σ), expressed in dimensionless units.

Fig. 4. Same as Fig. 3 but now for ν_max.
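The extrapolation described above can be illustrated with a short sketch. It fits a straight line through log10(MAD) versus log10(timespan) and evaluates that line at 2000 days. The MAD values below are hypothetical placeholders, chosen so that MAD is exactly proportional to 1/T; they are not the measured values of Fig. 5, and the function names are ours.

```python
import math

def fit_power_law(timespans, mads):
    """Least-squares straight line through (log10 T, log10 MAD).

    Returns (slope, intercept) of log10(MAD) = intercept + slope * log10(T).
    """
    xs = [math.log10(t) for t in timespans]
    ys = [math.log10(m) for m in mads]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def predict_mad(slope, intercept, timespan):
    """Evaluate the fitted line at a given timespan (in days)."""
    return 10.0 ** (intercept + slope * math.log10(timespan))

# Hypothetical MAD values (assumption for illustration only): MAD = 4 / T.
timespans = [50, 100, 200, 400]
mads = [4.0 / t for t in timespans]

slope, intercept = fit_power_law(timespans, mads)
# Ratio of the fitted MAD at 50 days to the extrapolated MAD at 2000 days.
reduction = predict_mad(slope, intercept, 50) / predict_mad(slope, intercept, 2000)
print(f"slope = {slope:.2f}, reduction factor = {reduction:.1f}")
```

Under a pure power law the reduction between two timespans depends only on the slope: a factor of 10 between 50 and 2000 days corresponds to a log-log slope of about -0.62, since log10(10)/log10(40) ≈ 0.62.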



We now turn to the uncertainties reported by the different
methods. The normalised distributions of these uncertainties are
shown in the central columns of Figs. 3 and 4. Again we can see
that for some run durations the different methods produce similar
uncertainties and for others they differ. An important
consideration is the validity of the uncertainties as a guide to the
reliability of the returned results. To this end, in the right hand
column we show the distribution of the offsets divided by their
individual quoted uncertainties, expressed in dimensionless units.
In case of statistically reliable quoted uncertainties we would
expect the distributions to have a width of ±1σ at half maximum.
A wider distribution indicates that the uncertainties are
underestimated and a narrower distribution indicates overestimated
uncertainties. For ⟨Δν⟩ we see that OCT and CAN provide realistic
uncertainties for runs of 50 day length, although the tails
of the distribution of OCT are well-populated. Both methods of
COR seem to overestimate the uncertainties. The banded nature
of the COR:UP results is a byproduct of the method used to find
the peak in the autocorrelation function. For longer datasets all
methods seem to overestimate the uncertainties to a certain
extent. Similar conclusions can be drawn for the results of ν_max in
the right hand panels of Fig. 4.
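The test of the quoted uncertainties described above can be sketched as follows. Each offset is divided by its own quoted uncertainty and the width of the resulting distribution is compared with unity. As an assumption of this sketch, the width is measured with a scaled median absolute deviation rather than read off a histogram at half maximum, and the tolerance of 0.2 is arbitrary. All input numbers are hypothetical.

```python
import statistics

def normalised_offsets(values, references, sigmas):
    """Offset from the reference value divided by the quoted uncertainty."""
    return [(v - r) / s for v, r, s in zip(values, references, sigmas)]

def robust_width(z):
    """Gaussian-equivalent width: 1.4826 times the median absolute deviation."""
    centre = statistics.median(z)
    return 1.4826 * statistics.median([abs(x - centre) for x in z])

def judge_uncertainties(width, tolerance=0.2):
    """Width near 1: realistic; wider: underestimated; narrower: overestimated."""
    if width > 1.0 + tolerance:
        return "underestimated"
    if width < 1.0 - tolerance:
        return "overestimated"
    return "realistic"

# Hypothetical short-run results, reference values and quoted uncertainties.
short_run = [4.8, 4.9, 5.0, 5.1, 5.2]
reference = [5.0] * 5
quoted_sigma = [0.4] * 5

z = normalised_offsets(short_run, reference, quoted_sigma)
width = robust_width(z)
verdict = judge_uncertainties(width)
print(f"width = {width:.3f} -> uncertainties {verdict}")
```

Here the scatter of the offsets is much smaller than the quoted uncertainties, so the normalised distribution is narrower than unity and the uncertainties are flagged as overestimated.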
Fig. 6. Left: Uncertainties (open diamonds: 50 days, dashed line: 400 days) and mean absolute deviations multiplied by 0.8 (see text;
filled diamonds: 50 days, solid line: 400 days) in ⟨Δν⟩ as a function of ⟨Δν⟩ for results from the different methods: COR:UP (panel
A), COR:EACF (panel B), OCT:PS⊗PS (panel C) and CAN (panel D). Right: Uncertainties (open diamonds: 50 days, dashed line:
400 days) and mean absolute deviations (filled diamonds: 50 days, solid line: 400 days) in ν_max as a function of ν_max for results from
three different methods: COR (panel E), OCT (panel F) and CAN (panel G).



The measures described above do however average over the
frequency range at which the oscillations occur, and the
uncertainty might be expected to be a function of frequency. Fig. 6
shows the frequency dependence of the mean uncertainty and
the median absolute deviation (MAD) for several methods. For a
Gaussian distribution (white noise), the typical ratio of root
median square deviation to MAD is roughly 0.8, so we multiply the
MAD by 0.8 in order to compare it with the typical uncertainty.
The left hand column is for ⟨Δν⟩ and the right hand column is for
ν_max. Each graph in the figure corresponds to a different method
and allows us to illustrate how the deviations (MAD) and
uncertainties correspond for a given method at the longest and the
shortest data duration, i.e., 400 and 50 day long datasets.

It is clear that although there is some consistency in the
curves for any one method, the frequency dependencies of the
uncertainty and of the deviation are not identical. We now
discuss each method in turn, starting with ⟨Δν⟩. For COR:UP, we
see again that the results for 50 or 400 day long timeseries are
remarkably similar. Significant improvement can only be seen
at low frequencies. The uncertainties seem to be overestimated.
For COR:EACF, at 50 days the uncertainties are overestimated.
However, at 400 days the uncertainties and MAD have reduced
and are more closely in agreement, except for the highest
frequencies where there are not many stars. For OCT:PS⊗PS, at 50
days the uncertainties are underestimated at low and medium
frequencies, with the agreement steadily improving as the frequency
increases. At 400 days, the uncertainties are progressively
overestimated. The determination of values, as illustrated by the
value of MAD, improves in the longer datasets. Finally, we consider
CAN. Although the trends for the 50 day results are very similar,
the uncertainties are slightly overestimated.



Just three methods are used for ν_max. For OCT and CAN
there is general agreement between MAD and uncertainty, with
a slight tendency for the uncertainties to be overestimated. The
uncertainties for COR:EACF are overestimated by roughly a factor
of two to three.

Additionally, for all methods the variation of MAD with
frequency is not strong, which supports our earlier assumption for
the outlier rejection to use a fixed threshold independent of
frequency.
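The frequency-resolved comparison of this subsection can be sketched as a simple binning exercise: within each bin of ⟨Δν⟩ (or ν_max) the scaled MAD of the offsets is set against the mean quoted uncertainty. The bin edges, the input numbers and the function name are hypothetical; the factor 0.8 is the scaling quoted in the text.

```python
import statistics

def binned_mad_and_uncertainty(x, offsets, sigmas, edges, scale=0.8):
    """Per-bin scaled median absolute deviation and mean quoted uncertainty.

    x       : parameter used for binning (e.g. the large separation)
    offsets : result minus reference value, per star
    sigmas  : quoted uncertainty, per star
    edges   : bin edges; a star falls in a bin if lo <= x < hi
    scale   : factor applied to the MAD before comparison (0.8 in the text)
    """
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, value in enumerate(x) if lo <= value < hi]
        if not idx:
            continue  # skip empty bins
        offs = [offsets[i] for i in idx]
        centre = statistics.median(offs)
        mad = statistics.median([abs(o - centre) for o in offs])
        mean_sigma = statistics.mean([sigmas[i] for i in idx])
        rows.append((lo, hi, scale * mad, mean_sigma))
    return rows

# Hypothetical data for six stars in two bins.
dnu = [2.0, 2.5, 3.0, 6.0, 6.5, 7.0]
offsets = [-0.02, 0.0, 0.02, -0.05, 0.0, 0.05]
sigmas = [0.03, 0.03, 0.03, 0.04, 0.04, 0.04]

rows = binned_mad_and_uncertainty(dnu, offsets, sigmas, edges=[0.0, 5.0, 10.0])
for lo, hi, scaled_mad, mean_sigma in rows:
    print(f"[{lo}, {hi}): scaled MAD = {scaled_mad:.3f}, mean sigma = {mean_sigma:.3f}")
```

A bin in which the mean uncertainty exceeds the scaled MAD would be read as overestimated uncertainties, and the reverse as underestimated ones.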
5.4. Offsets between different methods



In the previous subsection we saw that within any one given
method, short datasets can give, on average, slightly biased
results when compared with longer sets. Here we concentrate on
the differences between different methods using the results
obtained with 600 days of data. We know that the different
methods involve different assumptions and that no method is without
assumptions, as is shown by Kallinger et al. (2012). Two methods
can be considered to lie at extreme ends of the choices for
how to measure ⟨Δν⟩. At one extreme is CAN, which uses individual
peak bagging to measure two values of Δν close to the
peak of the oscillation power and returns their average as ⟨Δν⟩.
At the other extreme is COR:UP, which imposes a
regular pattern on the whole spectral range and returns a ⟨Δν⟩
based on that. It is known that the variation of the large separation
with frequency depends on the evolutionary state of the star
(Kallinger et al. 2012), and this is seen very clearly if the values
of ⟨Δν⟩ from CAN and COR:UP are compared (see Fig. 7).
Indeed the COR:UP results show a bimodal distribution with respect
to the CAN results, in which the left peak consists predominantly of
RC stars and the rightmost peak of RGB stars. Following the
reasoning of Kallinger et al. (2012), this clear difference between
the ⟨Δν⟩ values could even be used to classify whether a star is
already in its He-core burning phase. For the other methods the
differences follow the same pattern, but are not as clear because,
firstly, they are in between CAN and COR:UP in terms of their
global/local approach and, secondly, CAN and COR:UP are not
particularly sensitive to realization noise. For COR:UP this is
because of the regularity constraint and for CAN it is due to the
fact that the frequency determination of a given peak is relatively
insensitive to the realization noise given the long datasets.

For ν_max, effects are less pronounced. OCT agrees well with
CAN but still with a (small) difference between RGB and RC.
This could be due to a difference in the acoustic cutoff frequency
and/or differences in the smoothing applied to fit the power
excess. Mosser et al. (2012) investigated this in detail and showed
that smoothing can have a non-negligible effect (also already
pointed out by Kallinger et al. (2010a)). Furthermore, they show
that clump stars have oscillations with lower amplitudes, but
larger ν_max, than stars ascending the red-giant branch with
similar values for ⟨Δν⟩.

This comparison of results of long datasets obtained with
different methods shows that the definition of the obtained
parameter is of importance and that the differences in the
definition are significantly larger than the observational uncertainties.
Hence it is important when quoting a parameter value to provide
the detailed definition of that particular parameter. Note that all
methods also differ in their sensitivity to realization noise, as
already seen in paper I.

5.5. Comparison between the predicted and observed mode H/B

For each star analyzed, values for the envelope height and the
noise background at ν_max are returned. We have certain
expectations for the values. We expect, on average, the ratio of
these two parameters to have a value of about 3.7 or 4.0, depending
on the evolutionary status of the star (Mosser et al. 2012). From the
same work we know that within factors of order unity the values
returned by different methods will not be entirely consistent. In
this section we explore how closely the expectations are met. We
also look at how the ratio varies from run to run, particularly for
the short runs, in order to evaluate whether this is a significant
factor in the non-detection of the oscillations. For the longest
available dataset of 600 days, the median value of the observed
H/B is 4.1 with an inter-quartile distance of 1.4, which is roughly
consistent with the expectations.

Fig. 7. Normalised distribution of the offset of the individual 600
day results from the reference value, i.e., the CAN 600 day
results, for ⟨Δν⟩ (top) and ν_max (bottom). COR:UP, COR:EACF
and OCT results are indicated by black solid lines, green
dashed-dotted lines and red dashed lines respectively.

Next we turn to a consideration of the 50-day data. Here
we find that on average the returned envelope height and noise
background are consistent with the figures for the longer runs.
However, this masks a large amount of variability. The apparent
height of the envelope is very variable. We do not know if this
is genuine variability or a defect in the algorithms. However, it
is clear that even with height-to-background ratios significantly
below unity, detection of the modes is possible thanks to the
regular pattern of the oscillations. We do not have the values where
the algorithms failed to find evidence for oscillations and so
cannot comment on the height-to-background ratio in these cases.



6. Prediction of ν_max from rms flux

An automated analysis of the red giant data is made more
difficult by the fact that for some of the largest giants the peak in
the mode power is at very low frequency (below ~5 μHz). Unless
the datasets are very long, the spectra do not have enough
resolution to clearly distinguish the oscillations. The automated
algorithms may then fasten on features at other frequencies and thus
provide a false positive detection. We therefore have sought an
independent parameter to guide the software to the appropriate
region. We have found that the mean flux variance in the
timeseries data is one such guide. We first provide an analysis which
shows why this should be so and then provide the data to
illustrate the dependence that we find.

Fig. 8. Variance of the flux as a function of ν_max, with RGB, RC,
second clump and AGB stars indicated by black asterisks, red
diamonds, green triangles and blue crosses, respectively. Fits to
the values of the four evolutionary states are shown by the yellow
solid line, the green dashed line, the red dashed-dotted line and
the light-blue dashed-triple dotted line. The prediction from Eq. (10)
is indicated with the gray line.

Table 2. Coefficients of the fit V = a ν_max^b, with V the variance of
the flux in ppm² and ν_max the frequency of maximum oscillation
power in μHz, for different evolutionary phases.

                 a                        b
RGB              2.4 × 10^7 ± 1 × 10^6    -1.18 ± 0.01
RC               7 × 10^8 ± 2 × 10^8      -2.13 ± 0.07
second clump     1.1 × 10^7 ± 5 × 10^6    -1.1 ± 0.1
AGB              4.1 × 10^7 ± 7 × 10^6    -1.53 ± 0.08

Parseval's theorem states that the variance of the timeseries is
equal to the integrated power in the spectrum. We therefore look
at the sources of power in the spectrum. At very low frequencies,
instrumental effects will become important. To some extent
these have been removed from the data considered here by the data
preparation algorithms. At all frequencies there is photon shot
noise, but the red giants are usually sufficiently bright that it can
be neglected. As a consequence, for red giants the major sources
of the signal in the data are the granulation and the oscillations.
The mode power is modelled as a Gaussian of height H and full
width at half power δν_env, hence the total power in the modes is

$$P_{\rm mode\_total} = \frac{1}{2}\sqrt{\frac{\pi}{\ln 2}}\, H\, \delta\nu_{\rm env}. \qquad (2)$$

Using Mosser et al. (2012) we can express both the height at
maximum and the width of the distribution as a function of the
frequency of maximum power:

$$P_{\rm mode\_total} = 1.4 \times 10^{7}\, \nu_{\rm max}^{-1.5}. \qquad (3)$$
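Eq. (2) is the area under a Gaussian of height H and full width at half maximum δν_env. A minimal numerical check is sketched below. The values of ν_max, H and δν_env are arbitrary illustrative numbers, not values from the paper, and the function names are ours.

```python
import math

def gaussian_envelope(nu, nu_max, height, fwhm):
    """Gaussian power excess with peak `height` at `nu_max` and the given FWHM."""
    return height * math.exp(-4.0 * math.log(2.0) * ((nu - nu_max) / fwhm) ** 2)

def total_mode_power(height, fwhm):
    """Closed form of Eq. (2): 0.5 * sqrt(pi / ln 2) * H * FWHM."""
    return 0.5 * math.sqrt(math.pi / math.log(2.0)) * height * fwhm

def integrate_envelope(nu_max, height, fwhm, half_range=5.0, steps=20000):
    """Trapezoidal integral of the envelope over nu_max +/- half_range * FWHM."""
    lo = nu_max - half_range * fwhm
    hi = nu_max + half_range * fwhm
    step = (hi - lo) / steps
    total = 0.5 * (gaussian_envelope(lo, nu_max, height, fwhm)
                   + gaussian_envelope(hi, nu_max, height, fwhm))
    for k in range(1, steps):
        total += gaussian_envelope(lo + k * step, nu_max, height, fwhm)
    return total * step

# Arbitrary illustrative values (assumptions): nu_max and FWHM in muHz.
nu_max, height, fwhm = 30.0, 1000.0, 5.0
numeric = integrate_envelope(nu_max, height, fwhm)
analytic = total_mode_power(height, fwhm)
print(f"numeric = {numeric:.3f}, analytic = {analytic:.3f}")
```

The prefactor 0.5 sqrt(π / ln 2) is approximately 1.064, so the total mode power is only slightly larger than the product of height and width.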



The frequency distribution of the power in the granulation is
modelled according to the Harvey prescription:

$$P_{\rm gran}(\nu) = \frac{4\sigma_{\rm int}^{2}\,\tau_{\rm gran}}{1 + (2\pi\tau_{\rm gran}\nu)^{2}}, \qquad (4)$$



where the variance in the timeseries of the granulation is σ_int² and
τ_gran is the timescale of the granulation. We can use this to
estimate the power, B, in the granulation signal at ν_max. At ν_max the
factor of unity in the denominator can be neglected:

$$B = \frac{\sigma_{\rm int}^{2}}{\pi^{2}\,\tau_{\rm gran}\,\nu_{\rm max}^{2}}. \qquad (5)$$

From Mathur et al. (2011) we have that τ_gran ≈ 0.7 ν_max^-0.89, hence

$$B = \frac{\sigma_{\rm int}^{2}}{0.7\pi^{2}\,\nu_{\rm max}^{1.11}}. \qquad (6)$$

Knowing that H = 2.03 × 10^7 ν_max^-2.38, we can use the observation
that the ratio of height to background is a constant of value 3.7
to 4, depending on the evolutionary state of the star:

$$\frac{H}{B} = \frac{14 \times 10^{7}\, \nu_{\rm max}^{-1.27}}{\sigma_{\rm int}^{2}}. \qquad (7)$$



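Knowing H/B we can now estimate a value for σ_int². Thus the
total variance (V) in the timeseries is

$$V = \sigma_{\rm int}^{2} + P_{\rm mode\_total} \qquad (8)$$

$$V = \left[\frac{14 \times 10^{7}\, \nu_{\rm max}^{-1.27}}{H/B} + 1.4 \times 10^{7}\, \nu_{\rm max}^{-1.5}\right] {\rm ppm}^{2}. \qquad (9)$$

A typical value for H/B is about 4, hence

$$V = \left(3.5\, \nu_{\rm max}^{-1.27} + 1.4\, \nu_{\rm max}^{-1.5}\right) \times 10^{7}\ {\rm ppm}^{2}. \qquad (10)$$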

It is clear that although the power law indices of ν_max in the two
components of the noise are not the same, they are relatively
close to each other. The granulation provides just over twice the
amount of power as do the modes.
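The arithmetic behind the predicted variance can be retraced with a few lines of code. The sketch below takes the coefficients and exponents of Eqs. (3), (6) and (7) as assumptions (H = 2.03 × 10^7 ν_max^-2.38, τ_gran ≈ 0.7 ν_max^-0.89, total mode power 1.4 × 10^7 ν_max^-1.5) and evaluates the two terms of Eq. (9). The function names are ours.

```python
import math

H_COEFF = 2.03e7     # assumed coefficient of the height H, in ppm^2/muHz
TAU_COEFF = 0.7      # assumed coefficient of the granulation timescale
MODE_COEFF = 1.4e7   # assumed coefficient of the total mode power, Eq. (3)

def granulation_variance(nu_max, h_over_b=4.0):
    """First term of Eq. (9): sigma_int^2 obtained from H/B via Eq. (7)."""
    coeff = H_COEFF * TAU_COEFF * math.pi ** 2   # about 14e7
    return coeff * nu_max ** -1.27 / h_over_b

def mode_power(nu_max):
    """Second term of Eq. (9): total power in the modes."""
    return MODE_COEFF * nu_max ** -1.5

def predicted_variance(nu_max, h_over_b=4.0):
    """Total predicted flux variance in ppm^2 for nu_max in muHz."""
    return granulation_variance(nu_max, h_over_b) + mode_power(nu_max)

eq7_coeff = H_COEFF * TAU_COEFF * math.pi ** 2
print(f"coefficient in Eq. (7): {eq7_coeff:.3e}")
for nu in (10.0, 30.0, 100.0):
    print(f"nu_max = {nu:6.1f} muHz -> V = {predicted_variance(nu):.3e} ppm^2")
```

The product 2.03 × 0.7 × π² evaluates to about 14.0, which reproduces the coefficient of Eq. (7), and dividing by H/B = 4 gives the 3.5 of Eq. (10).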

Observationally we get V = 2.4 × 10^7 ν_max^-1.18 ppm² for RGB stars
(see fit in Fig. 8). We see that for other evolutionary states the
fits have different coefficients (Table 2). This indicates that there
are differences in the granulation description and/or in the
height and width ratio of the oscillation power as a function
of evolutionary phase. This is consistent with what is shown by
Mosser et al. (2012), and needs further investigation, which is
beyond the scope of this paper.
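To use the flux variance as a guide for the automated codes, the fitted power law V = a ν_max^b has to be inverted. A minimal sketch is given below. The default coefficients are the RGB fit as read from the text (a = 2.4 × 10^7, b = -1.18) and should be treated as assumptions; the function name is hypothetical.

```python
def nu_max_from_variance(variance_ppm2, a=2.4e7, b=-1.18):
    """Invert V = a * nu_max**b to estimate nu_max (muHz) from the variance (ppm^2).

    The defaults correspond to the RGB fit; other evolutionary phases
    need their own (a, b) pair.
    """
    if variance_ppm2 <= 0:
        raise ValueError("variance must be positive")
    return (variance_ppm2 / a) ** (1.0 / b)

# Round trip: build the variance expected for nu_max = 50 muHz and invert it.
variance = 2.4e7 * 50.0 ** -1.18
estimate = nu_max_from_variance(variance)
print(f"V = {variance:.3e} ppm^2 -> nu_max = {estimate:.2f} muHz")
```

Because the coefficients differ between evolutionary phases, such an estimate is only a rough guide to the frequency region in which to search, unless the phase of the star is known or assumed.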

7. Summary 

In this work we investigated the impact of the length of the
timeseries on the precision and accuracy of the determined global
oscillation parameters ν_max and ⟨Δν⟩ of red giants. We used Kepler
light curves spanning about 600 days and divided them into shorter
runs of 50, 100, 200 and 400 days. All these runs have been
analysed using automated methods. The oscillation detection rate has
been compared with predictions and the resulting values for the
global oscillation parameters have been compared as a function
of method, run length and ⟨Δν⟩ of the oscillations. From this study
we find that:



- For 95% of the stars consistent global oscillation parameters
are obtained from 600 day timeseries with different methods.
For the remaining 5%, there were good reasons for the lack
of consistency.

- Using the observational methods we find reliable detections of
oscillations for 95% or more of the stars (relative to the
consistent results of the 600 day data) in timeseries of 400
days or longer.

- Current predictions of the detectability of oscillations are
based on the amplitudes and predict that in the majority of
the cases the likelihood to detect oscillations is above 90%
for both the long and short runs. However, most observational
algorithms use the regularity in the power spectrum to
detect the oscillations and the regularity has reduced
sensitivity for shorter runs.

- The precision of the determined global oscillation parameters
increases with increasing timeseries length and the trends
suggest that this continues for even longer timeseries than
investigated here. From the extrapolation of fits to the median
absolute deviations, a reduction of more than a factor of 10 is
foreseen for an increase in timespan from 50 to 2000 days (the
currently foreseen length of the mission). Thus, there are
real advantages to be gained from working with even longer
timeseries than considered here. We note that the universal
pattern is already effective for short datasets.

- The distributions of the offsets (the difference between results
of short runs and the result obtained with the
same method on the 600-day long timeseries) divided by the
quoted uncertainties show that the quoted uncertainties have
a tendency to be overestimated, which is in general more
severe for longer datasets. However, this does depend on the
method.

- We find that 50 day timeseries are not long enough to be
certain to pick up more than 90% of the oscillations with the
currently employed methods.

- When comparing different methods it is clear that the
differences due to different definitions are non-negligible. This
difference is a function of the evolutionary state of the stars
and this could be used to determine the evolutionary state.

- The different strengths, definitions and sensitivity to
realization noise of the different methods indicate that the
simultaneous use of more methods is likely to be profitable.

Additionally, we propose and justify a new method to
estimate the frequency of maximum oscillation power from the
variance in the timeseries. We show that the dependence of the flux
variance on ν_max is also a function of evolutionary phase. The
effectiveness of this method does not depend on the data
duration nor on the location of the peak of the spectrum, always
assuming that the necessary data detrending is not attenuating
the oscillation signal. We recommend that this method be used
in conjunction with the methods described here as an additional
independent constraint to detect the oscillations.